Online Experiments: Practical Lessons
Authors
Abstract
From ancient times through the 19th century, physicians used bloodletting to treat acne, cancer, diabetes, jaundice, plague, and hundreds of other diseases and ailments (D. Wooton, Doctors Doing Harm since Hippocrates, Oxford Univ. Press, 2006). It was judged most effective to bleed patients while they were sitting upright or standing erect, and blood was often removed until the patient fainted. On 12 December 1799, 67-year-old President George Washington rode his horse in heavy snowfall to inspect his plantation at Mount Vernon. A day later, he was in respiratory distress and his doctors extracted nearly half of his blood over 10 hours, causing anemia and hypotension; he died that night.
Today, we know that bloodletting is unhelpful because in 1828 a Parisian doctor named Pierre Louis did a controlled experiment. He treated 78 people suffering from pneumonia with either early and frequent bloodletting or less aggressive measures, and found that bloodletting didn't help survival rates or recovery times.
Having roots in agriculture and medicine, controlled experiments have spread into the online world of websites and services. In an earlier Web Technologies article (R. […] Platform team introduced basic practices of good online experimentation. Three years later, having run hundreds of experiments on more than 20 websites, including some of the world's largest, like msn.com and bing.com, we have learned some important practical lessons about the limitations of standard statistical formulas and about data traps. These lessons, even for seemingly simple univariate experiments, aren't taught in Statistics 101. After reading this article, we hope you'll have better negative introspection: to know what you don't know.
In an online controlled experiment, users are randomly assigned to two or more groups for some period of time and exposed to different variants of the website.
The most common online experiment, the A/B test, has two variants: the A version of the site is the control and the B version is the treatment. The experimenters define an overall evaluation criterion (OEC) and compute a statistic, for example the mean of the OEC, for each variant. The OEC statistic is also referred to as a key performance indicator (KPI); in statistics, the OEC is often called the response or dependent variable. The difference between the OEC statistic for the treatment and control groups is the treatment effect. If the experiment was designed and executed properly, the only thing consistently different between the two variants is the …
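The A/B setup described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the equal-probability assignment, and the sample OEC values are all hypothetical, chosen only to show random assignment, a per-variant mean of the OEC, and the treatment effect as the difference between the two means.

```python
import random
from statistics import mean


def assign_variant(rng: random.Random) -> str:
    """Randomly assign a user to control ("A") or treatment ("B").

    Equal assignment probability is an assumption of this sketch.
    """
    return rng.choice(["A", "B"])


def treatment_effect(observations):
    """Compute the mean OEC per variant and their difference.

    observations: list of (variant, oec_value) pairs, one per user.
    Returns (mean_control, mean_treatment, treatment_effect).
    """
    control = [value for variant, value in observations if variant == "A"]
    treatment = [value for variant, value in observations if variant == "B"]
    mean_control = mean(control)
    mean_treatment = mean(treatment)
    # Treatment effect: OEC statistic of treatment minus that of control.
    return mean_control, mean_treatment, mean_treatment - mean_control


# Hypothetical per-user OEC values (for example, clicks per user).
observations = [("A", 1.0), ("A", 3.0), ("B", 2.0), ("B", 4.0), ("B", 3.0)]
mean_control, mean_treatment, effect = treatment_effect(observations)

# Random assignment of 1,000 hypothetical users with a fixed seed.
rng = random.Random(0)
variants = [assign_variant(rng) for _ in range(1000)]
```

With these made-up values, the control mean is 2.0, the treatment mean is 3.0, and the treatment effect is 1.0. The sketch stops at the point estimate; deciding whether such a difference is statistically meaningful is a separate step that it does not attempt.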
Similar resources
Online Field Experiments: Lessons from CommunityLab
We report briefly on a set of online field experiments conducted as part of the CommunityLab collaborative research project. Based on these projects, and the published research literature, we present an analysis of the design choices for online field experiments and report on lessons learned.
Evaluation and comparison the results comprehensive Exam and the mean scores of Basic sciences courses of Isfahan medical students before and after the changes of basic science courses
Introduction: The aim of this study is the evaluation of medical students’ academic achievement after the changes in arrangement and courses of some Basic sciences lessons in school of medicine and comparing their academic achievement before and after the changes in arrangement and the courses. Methods: In this descriptive analytical study 156 samples were selected from 2004 (group 1) and 2005...
Learning Analytics and Educational Games: Lessons Learned from Practical Experience
Learning Analytics (LA) is an emerging discipline focused on obtaining information by analyzing students’ interactions with on-line educational contents. Data is usually collected from online activities such as forums or virtualized courses hosted on Learning Management Systems (e.g. Moodle). Educational games are emerging as a popular type of e-learning content and their high interactivity mak...
Practical Lessons from the Small Bowel Bleeding Lesions: A Case Report on Small Bowel Cavernous Hemangioma
TweetGenie: Development, Evaluation, and Lessons Learned
TweetGenie is an online demo that infers the gender and age of Twitter users based on their tweets. TweetGenie was able to attract thousands of visitors. We collected data by asking feedback from visitors and launching an online game. In this paper, we describe the development of TweetGenie and evaluate the demo based on the received feedback and manual annotation. We also reflect on practical ...
Journal title:
- IEEE Computer
Volume 43 Issue
Pages -
Publication date 2010